Boosting as a Regularized Path to a Maximum Margin Classifier
Authors
Abstract
In this paper we study boosting methods from a new perspective. We build on recent work by Efron et al. to show that boosting approximately (and in some cases exactly) minimizes its loss criterion with an l1 constraint on the coefficient vector. This helps understand the success of boosting with early stopping as regularized fitting of the loss criterion. For the two most commonly used criteria (exponential and binomial log-likelihood), we further show that as the constraint is relaxed — or equivalently as the boosting iterations proceed — the solution converges (in the separable case) to an “l1-optimal” separating hyperplane. We prove that this l1-optimal separating hyperplane has the property of maximizing the minimal l1-margin of the training data, as defined in the boosting literature. An interesting fundamental similarity between boosting and kernel support vector machines emerges, as both can be described as methods for regularized optimization in high-dimensional predictor space, utilizing a computational trick to make the calculation practical, and converging to margin-maximizing solutions. While this statement describes SVMs exactly, it applies to boosting only approximately.
Keywords: Boosting, regularized optimization, support vector machines, margin maximization
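To make the connection concrete, here is a minimal sketch (not the authors' code) of ε-boosting on the exponential loss: taking many tiny coordinate steps traces an approximately l1-regularized coefficient path, and the quantity the limiting solution maximizes is the normalized minimal l1-margin min_i y_i f(x_i) / ||β||_1. The toy data, step size, and iteration count below are illustrative assumptions.

```python
import numpy as np

# Toy separable problem; the columns of X play the role of weak learners.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)

beta = np.zeros(p)
eps = 0.01                                       # small, "shrunken" step size
for _ in range(5000):
    w = np.exp(-y * (X @ beta))                  # exponential-loss weights
    grad = -(X * (w * y)[:, None]).mean(axis=0)  # gradient of the average loss
    j = np.argmax(np.abs(grad))                  # pick the best coordinate (weak learner)
    beta[j] -= eps * np.sign(grad[j])            # tiny step: ||beta||_1 grows slowly

# Normalized minimal l1-margin of the fit along the path.
l1_margin = np.min(y * (X @ beta)) / np.abs(beta).sum()
print(f"minimal l1-margin: {l1_margin:.4f}")
```

As the number of tiny steps grows (i.e., the constraint is relaxed), this margin should approach its maximal value on the separable toy problem, mirroring the convergence result stated in the abstract.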
Similar Articles
Margins, Shrinkage, and Boosting
This manuscript shows that AdaBoost and its immediate variants can produce approximate maximum margin classifiers simply by scaling step size choices with a fixed small constant. In this way, when the unscaled step size is an optimal choice, these results provide guarantees for Friedman’s empirically successful “shrinkage” procedure for gradient boosting (Friedman, 2000). Guarantees are also pr...
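The shrinkage idea described here — scaling every boosting step by a fixed small constant — is exposed in common libraries as a learning-rate parameter. Below is a hedged illustration using scikit-learn's AdaBoostClassifier; the library, dataset, and constants are choices of this sketch, not of the cited manuscript.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Unscaled steps vs. steps scaled by a fixed small constant (learning_rate).
plain = AdaBoostClassifier(n_estimators=200, learning_rate=1.0, random_state=0)
shrunk = AdaBoostClassifier(n_estimators=2000, learning_rate=0.05, random_state=0)
print(plain.fit(X, y).score(X, y), shrunk.fit(X, y).score(X, y))
```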
Speed and Sparsity of Regularized Boosting
Boosting algorithms with l1 regularization are of interest because l1 regularization leads to sparser composite classifiers. Moreover, Rosset et al. have shown that for separable data, standard lp-regularized loss minimization results in a margin-maximizing classifier in the limit as regularization is relaxed. For the case p = 1, we extend these results by obtaining explicit convergence bounds o...
Sequential Maximum Margin Classifiers for Partially Labeled Data
In many real-world applications, data is not collected as one batch, but sequentially over time, and often it is not possible or desirable to wait until the data is completely gathered before analyzing it. Thus, we propose a framework to sequentially update a maximum margin classifier by taking advantage of the Maximum Entropy Discrimination principle. Our maximum margin classifier allows for a...
Boosting Based on a Smooth Margin
We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can maximize in order to produce a maximum margin classifier. Our first algorithm is simply coordinate ascent on this function, involving a line se...
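A hedged sketch of the "smooth margin" idea: the hard minimal margin min_i m_i is replaced by a differentiable soft-minimum — here a negative log-sum-exp, which may differ from the exact functional form used in the cited paper — that coordinate ascent can then increase.

```python
import numpy as np

def soft_min_margin(margins, lam=10.0):
    # Smooth approximation of min(margins); tightens as lam grows
    # (it lies within log(n)/lam of the true minimum).
    return -np.log(np.mean(np.exp(-lam * margins))) / lam

m = np.array([0.3, 0.5, 0.8, 1.2])  # illustrative margins y_i * f(x_i)
print(m.min(), soft_min_margin(m, lam=5.0), soft_min_margin(m, lam=50.0))
```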
Regularizing AdaBoost
Boosting methods maximize a hard classification margin and are known as powerful techniques that do not exhibit overfitting for low-noise cases. For noisy data, however, boosting will try to enforce a hard margin and thereby give too much weight to outliers, which then leads to the dilemma of non-smooth fits and overfitting. Therefore we propose three algorithms to allow for soft margin classification by ...
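A small numerical illustration of the outlier effect mentioned above (the margin values are made up for this sketch): under the exponential loss, a point with a large negative margin receives an exponentially large weight, which is why enforcing a hard margin on noisy data tends to overfit.

```python
import numpy as np

margins = np.array([2.0, 1.0, 0.5, -3.0])  # y_i * f(x_i); the last point is a mislabeled outlier
weights = np.exp(-margins)
print(weights / weights.sum())             # the outlier receives roughly 95% of the total weight
```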
Journal: Journal of Machine Learning Research
Volume: 5
Pages: -
Publication date: 2004